305 research outputs found

    The Role of Provenance Management in Accelerating the Rate of Astronomical Research

    The availability of vast quantities of data through electronic archives has transformed astronomical research. It has also enabled the creation of new products, models and simulations, often from distributed input data and models, that are themselves made electronically available. These products will only provide maximal long-term value to astronomers when accompanied by records of their provenance; that is, records of the data and processes used in the creation of such products. We use the creation of image mosaics with the Montage grid-enabled mosaic engine to emphasize the necessity of provenance management and to understand the science requirements that higher-level products impose on provenance management technologies. We describe experiments with one technology, the "Provenance Aware Service Oriented Architecture" (PASOA), that stores provenance information at each step in the computation of a mosaic. The results inform the technical specifications of provenance management systems, including the need for extensible systems built on common standards. Finally, we describe examples of provenance management technology emerging from the fields of geophysics and oceanography that have applicability to astronomy applications. Comment: 8 pages, 1 figure; Proceedings of Science, 201

    A Virtual Data Grid for LIGO

    GriPhyN (Grid Physics Network) is a large US collaboration to build grid services for large physics experiments, one of which is LIGO, a gravitational-wave observatory. This paper explains the physics and computing challenges of LIGO, and the tools that GriPhyN will build to address them. A key component needed to implement the data pipeline is a virtual data service: a system that dynamically creates the data products requested during the various stages. The requested data may already have been processed in a certain way; it may be in a file on a storage system, it may be cached, or it may need to be created through computation. The full elaboration of this system will allow complex data pipelines to be set up as virtual data objects, with existing data being transformed in diverse ways.

    The Application of Cloud Computing to the Creation of Image Mosaics and Management of Their Provenance

    We have used the Montage image mosaic engine to investigate the cost and performance of processing images on the Amazon EC2 cloud, and to inform the requirements that higher-level products impose on provenance management technologies. We will present a detailed comparison of the performance of Montage on the cloud and on the Abe high performance cluster at the National Center for Supercomputing Applications (NCSA). Because Montage generates many intermediate products, we have used it to understand the science requirements that higher-level products impose on provenance management technologies. We describe experiments with provenance management technologies such as the "Provenance Aware Service Oriented Architecture" (PASOA). Comment: 15 pages, 3 figur

    Understanding User Behavior: From HPC to HTC

    In this paper, we investigate the differences and similarities in user job submission behavior in High Performance Computing (HPC) and High Throughput Computing (HTC). We consider job submission behavior in terms of parallel batch-wise submissions, as well as delays and pauses in job submission. Our findings show that modeling user-based HTC job submission behavior requires knowledge of the underlying bags of tasks, which is often unavailable. Furthermore, we find evidence that subsequent job submission behavior is not influenced by the different complexities and requirements of HPC and HTC jobs.